Abstract

Most previous studies of the sorting algorithm Quicksort have used the number of key 
comparisons as a measure of the cost of executing the algorithm. Here we suppose that 
the n independent and identically distributed (iid) keys are each represented as a sequence 
of symbols from a probabilistic source and that Quicksort operates on individual symbols, 
and we measure the execution cost as the number of symbol comparisons. Assuming only 
a mild "tameness" condition on the source, we show that there is a limiting distribution 
for the number of symbol comparisons after normalization: first centering by the mean 
and then dividing by n. Additionally, under a condition that grows more restrictive as p 
increases, we have convergence of moments of orders p and smaller. In particular, we 
have convergence in distribution and convergence of moments of every order whenever 
the source is memoryless, i.e., whenever each key is generated as an infinite string of iid 
symbols. This is somewhat surprising: Even for the classical model that each key is an 
iid string of unbiased ("fair") bits, the mean exhibits periodic fluctuations of order n.

1. Introduction, review of related literature, and summary 

1.1. Introduction. We consider Hoare's [13] Quicksort algorithm applied to n
distinct random items (called keys) $X_1, \dots, X_n$, each represented as a word (i.e.,
infinite string of symbols such as bits) from some specified finite or countably infinite 
alphabet. We will consider various probabilistic mechanisms [called (probabilistic) 
sources] for generating the symbols within a key, but we will always assume that 
the keys themselves are iid (independent and identically distributed), and we will 
later place conditions on the source that rule out the generation of equal keys. 

Quicksort$(X_1, \dots, X_n)$ chooses one of the n keys $X_1, \dots, X_n$ (called the "pivot")
uniformly at random, compares each of the other keys to it, and then proceeds 
recursively to sort both the keys smaller than the pivot and those larger than it. 

Key observation (coupling): Because of the assumption that the keys are iid, we
may take the pivot to be the first key in the sequence, $X_1$. Thus if $X_1, X_2, \dots$ is
an infinite sequence of keys and $C_n$ is any measure of the cost of sorting n random
keys using any cost function c (for example, the number of key comparisons or the
number of symbol comparisons), then we can place all the random variables $C_n$ on
a common probability space by using $C_n = c(X_1, \dots, X_n)$. Notice that then $C_n$
is nondecreasing in n. We will assume throughout that this natural coupling of
the random variables $C_n$ has been used. The coupling opens up the possibility of
establishing stronger forms of convergence than convergence in distribution, such
as almost sure convergence and convergence in $L^p$, for suitably normalized $C_n$.

Many authors (Knuth [16], Régnier [19], Rösler [21], Knessl and Szpankowski [15],
Fill and Janson [5] [6], Neininger and Rüschendorf [18], and others) have studied
$K_n$, the (random) number of key comparisons performed by the algorithm. This is
an appropriate measure of the cost of the algorithm if each key comparison has the 
same cost. On the other hand, if keys are represented as words and comparisons are 
done by scanning the words from left to right, comparing the symbols of matching 
index one by one, then the cost of comparing two keys is determined by the number 
of symbols compared until a difference is found. We call this number the number 
of symbol comparisons for the key comparison, and let $S_n$ denote the total number
of symbol comparisons when n keys are sorted by Quicksort. Symbol-complexity 
analysis allows us to compare key-based algorithms such as Quicksort with digital 
algorithms such as those utilizing digital search trees. 
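
The two cost measures can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, and keys are truncated to 64 random bits as a stand-in for infinite words from the standard binary source (an assumption that makes ties negligibly unlikely). The pivot is always the first key and the partition preserves arrival order, as in the natural coupling described above.

```python
import random

def symbol_cost(u, v):
    """Number of symbol comparisons needed to order two distinct words:
    scan left to right until the first differing symbol."""
    k = 0
    while u[k] == v[k]:
        k += 1
    return k + 1

def quicksort_costs(keys):
    """Quicksort pivoting on the first key (natural coupling).
    Returns (sorted_keys, key_comparisons, symbol_comparisons)."""
    if len(keys) <= 1:
        return list(keys), 0, 0
    pivot, rest = keys[0], keys[1:]
    # Each remaining key is compared once with the pivot.
    sym = sum(symbol_cost(pivot, x) for x in rest)
    smaller = [x for x in rest if x < pivot]   # order of arrival preserved
    larger = [x for x in rest if x > pivot]
    s_sorted, s_key, s_sym = quicksort_costs(smaller)
    l_sorted, l_key, l_sym = quicksort_costs(larger)
    return (s_sorted + [pivot] + l_sorted,
            len(rest) + s_key + l_key,
            sym + s_sym + l_sym)

def random_keys(n, bits=64, seed=0):
    """Illustrative keys: n words of `bits` iid fair bits (truncation is an assumption)."""
    rng = random.Random(seed)
    return [format(rng.getrandbits(bits), "0{}b".format(bits)) for _ in range(n)]
```

Calling `quicksort_costs(random_keys(n))` for increasing n on prefixes of one fixed key sequence yields nondecreasing key and symbol counts, which is the monotonicity the coupling provides.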

The goal of the present work is to establish a limiting distribution for the normalized
sequence of random variables $(S_n - \mathbf{E} S_n)/n$. Both exact and limiting
distributions of $S_n$ will depend on the source, unlike for $K_n$.

1.2. Review of closely related literature (Quicksort and QuickSelect). Until
now, study of asymptotics for Quicksort's $S_n$ has been limited mainly to the
expected value $\mathbf{E} S_n$. Fill and Janson [7] were the pioneers in that regard,
obtaining, inter alia, exact and asymptotic expressions for $\mathbf{E} S_n$ [consult their
Theorem 1.1, and note that the asymptotic expansion extends through terms of order n
with an $O(\log n)$ remainder] when the keys are infinite binary strings and the bits
within a key result from iid fair coin tosses. (We will refer to this model for key
generation as "the standard binary source". Equivalently, a key is generated by
sampling uniformly from the unit interval, representing the result in binary notation,
and dropping the leading "binary point".) They found that the expected
number of bit comparisons required by Quicksort to sort n keys is asymptotically
equivalent to $\frac{1}{\ln 2}\, n \ln^2 n$, whereas the lead-order term of the expected number of
key comparisons is $2 n \ln n$, smaller by a factor of order $\log n$. Now suppose that
$N = (N(t) : 0 \le t < \infty)$ is a Poisson process with rate 1 and is independent of the
generation of the keys, and let $S(t) := S_{N(t)}$. The authors also found for each fixed
$1 \le p < \infty$ an upper bound independent of $t \ge 1$ on the $L^p$-norm of

(1.1)  $Y(t) := \frac{S(t) - \mathbf{E} S(t)}{t}$

(see [7, Remark 5.1(a)] and the corresponding [5, Proposition 5.7]), leading them to
speculate that $Y(t)$ might have a limiting distribution as $t \to \infty$. We will see that
a limiting distribution does indeed exist, not only for the standard binary source
but for a wide range of sources, as well.

Vallée, Clément, Fill, and Flajolet [23] greatly extended the scope of [7] by
establishing for much more general sources both an exact expression for $\mathbf{E} S_n$ [consult
their Proposition 3 and display (8)] and an asymptotic expansion (see their
Theorem 1) through terms of order n with an $o(n)$ remainder. For the broad class of
sources $\mathcal{S}$ considered, the expected number of symbol comparisons is of lead order
$\frac{1}{h(\mathcal{S})}\, n \ln^2 n$, where $h(\mathcal{S})$ is the entropy of the source (see their Figure 1 for a
definition).

Building on work of Fill and Nakama [9], who had in turn followed closely along
the lines of [7], Vallée et al. [23] also studied the expected number of symbol
comparisons required by the algorithm QuickSelect(n, m). This algorithm [aka
Find(n, m)], a close cousin of Quicksort also devised by Hoare [12], finds a key
of specified rank m from a list of n keys. The authors of [23] considered the case
where $m = \alpha n + o(n)$ for general $\alpha \in [0, 1]$ [note: we will sometimes refer to
QuickQuant(n, $\alpha$), rather than QuickSelect(n, m), in this case] and a broad class
of sources $\mathcal{S}$. They found that the expected number of symbol comparisons
asymptotically has lead term $\rho_{\mathcal{S}}(\alpha)\, n$, where $\rho_{\mathcal{S}}(\alpha)$ is described in their Figure 1. Unlike
in the case of Quicksort, this is only a constant times larger than the expected
number of key comparisons, which is well known to be asymptotically $\kappa(\alpha)\, n$ with

$\kappa(\alpha) := 2[1 - \alpha \ln \alpha - (1 - \alpha) \ln(1 - \alpha)].$
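
As a small numerical illustration of the constant just displayed, the following Python sketch evaluates it; the function name `kappa` and the convention 0 ln 0 = 0 at the endpoints are assumptions made here for convenience, not notation from the paper.

```python
import math

def kappa(alpha):
    """Lead constant 2[1 - a ln a - (1 - a) ln(1 - a)] for key comparisons
    in QuickQuant(n, alpha), with the convention 0 * ln 0 = 0."""
    def xlogx(x):
        return 0.0 if x == 0.0 else x * math.log(x)
    return 2.0 * (1.0 - xlogx(alpha) - xlogx(1.0 - alpha))
```

For example, `kappa(0.5)` equals 2 + 2 ln 2 (about 3.39), the constant for finding a median, while `kappa(0.0)` and `kappa(1.0)` both equal 2, the constant for finding the minimum or maximum.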

For either QuickSelect or Quicksort, a deeper probabilistic analysis of the
numbers of key comparisons and symbol comparisons is obtained by treating entire
distributions and not just expectations: in particular, by finding limiting
distributions for suitable normalizations of these counts and, if possible, establishing
corresponding convergence of moments. Consider QuickQuant(n, $\alpha$) first. For both
key comparisons and symbol comparisons a suitable normalization is to divide by n,
with no need to center first. For a literature review on the number of key
comparisons, we refer the reader to [10, Section 2.2]; the number of symbol comparisons is
discussed next.

Fill and Nakama [10] (see also [17]) were the first to establish a limiting distribution
for the number of symbol comparisons for any sorting or searching algorithm.
They considered QuickQuant(n, $\alpha$) for a broad class of sources and found a limiting
distribution (depending on $\alpha$, and of course also on the source) for the number
$S_n(\alpha)$ of symbol comparisons (after division by n). It would take us a bit too far
afield to describe the limiting random variable $S(\alpha)$, so we refer the reader to [10,
Section 3.1, see (3.7)] for an explicit description. In their paper they use the natural
coupling discussed in Section 1.1 and prove, for each $\alpha$, that $S_n(\alpha)/n$ converges
to $S(\alpha)$ both (i) almost surely and, under ever stronger conditions on the source
as p increases, (ii) in $L^p$. Either conclusion implies convergence in distribution,
and (ii) implies convergence of moments of order $\le p$. The approach taken in [10]
is sufficiently general that the authors were able to unify treatment of key
comparisons and symbol comparisons and to consider various other cost functions: see
their Example 2.1.

Now we turn our attention back to Quicksort, the focus of this paper. Let $K_n$
(respectively, $S_n$) denote the random number of key (resp., symbol) comparisons
required by Quicksort to sort a list of n keys. We first consider $K_n$, for which
we know the following convergence in law, for some random variable T (where the
immaterial choice of scaling by n + 1, rather than n, matches with [19]):

(1.2)  $\frac{K_n - \mathbf{E} K_n}{n + 1} \xrightarrow{\mathcal{L}} T.$

This was proved (i) by Régnier [19], who used the natural coupling and martingale
techniques to establish convergence both almost surely and in $L^p$ for every
finite p; and (ii) by Rösler [21], who used the contraction method (see Rösler and
Rüschendorf [22] for a general discussion) to prove convergence in the so-called
minimal $L^p$ metric for every finite p [from which (1.2), with convergence of all
moments, again follows]. An advantage of Rösler's approach was identification of
the distribution of the limiting T as the unique distribution of a zero-mean random
variable with finite variance satisfying the distributional fixed-point equation

(1.3)  $T \overset{\mathcal{L}}{=} U T + (1 - U) T^* + g(U),$

with $g(u) := 1 + 2u \ln u + 2(1 - u) \ln(1 - u)$ and where, on the right, $T$, $T^*$, and $U$
are independent random variables; $T^*$ has the same distribution as $T$; and $U$ is
distributed uniformly over (0, 1). Later, Fill and Janson [5] showed that uniqueness
of the zero-mean solution $\mathcal{L}(T)$ to (1.3) continues to hold without the assumption
of finite variance, or indeed any other assumption.
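
Two consequences of the fixed-point equation can be checked numerically. Taking expectations with $\mathbf{E} T = 0$ requires $\mathbf{E}\, g(U) = 0$; and, under the finite-variance assumption, squaring both sides and using independence gives $\mathrm{Var}\, T = \tfrac{2}{3} \mathrm{Var}\, T + \mathbf{E}\, g(U)^2$, that is, $\mathrm{Var}\, T = 3\, \mathbf{E}\, g(U)^2$. The Python sketch below (a rough midpoint-rule integration; the helper names are hypothetical) evaluates both quantities and can be compared with the classical value $7 - 2\pi^2/3$ for the limiting variance of normalized Quicksort key comparisons.

```python
import math

def g(u):
    """Toll function g(u) = 1 + 2u ln u + 2(1 - u) ln(1 - u) of the fixed-point equation."""
    return 1.0 + 2.0 * u * math.log(u) + 2.0 * (1.0 - u) * math.log(1.0 - u)

def midpoint_mean(f, n=200_000):
    """Approximate E f(U) for U uniform on (0, 1) by the midpoint rule on n cells."""
    return sum(f((i + 0.5) / n) for i in range(n)) / n

mean_g = midpoint_mean(g)                              # should be close to 0
second_moment_g = midpoint_mean(lambda u: g(u) ** 2)   # E g(U)^2
var_T = 3.0 * second_moment_g                          # Var T = 3 E g(U)^2
```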

1.3. Summary. This paper establishes, for a broad class of sources, a limiting
distribution for the number $S_n$ of symbol comparisons for Quicksort. We tried
without success to mimic the approach used in [10] for QuickQuant. The approach
used in this paper, very broadly put, is to relate the count $S_n$ of symbol comparisons
to various counts of key comparisons and then rely (heavily) on the result
of Régnier [19]. Like Fill and Janson [7], we will find it much more convenient
to work mainly in continuous time than in discrete time, but we will also
"de-Poissonize" our result. In the continuous-time setting and notation established
at (1.1) (but without limiting attention to the standard binary source), we will
prove in this paper, assuming that the source is suitably "tame" (in a sense to be
made precise), that

(1.4)  $Y(t) := \frac{S(t) - \mathbf{E} S(t)}{t} \xrightarrow{\mathcal{L}} Y$

for some random variable Y. Following the lead of [19] and [10], we will use the
natural coupling discussed in Section 1.1. Under a mild tameness condition that
becomes more stringent as $p \in [2, \infty)$ increases, we will in fact establish convergence
in $L^p$; see our main Theorem 3.1 for a precise statement. In particular, for any
g-tamed source as defined in Remark 2.3(a) (for example, for any nondegenerate
memoryless source) we have convergence in $L^p$ for every finite p. Nondegeneracy
of the distribution of Y is proved by Bindjeme and Fill [1]; thus the denominator t
used in (1.4) is not too large to get an interesting limiting distribution.



Outline of the paper. After carefully describing in Section 2.1 the probabilistic
models used to govern the generation of keys, reviewing in Section 2.2 four known
results about the number of key comparisons we will need in our analysis of symbol
comparisons, and listing in Section 2.3 the other basic probability tools we will need,
in Section 3 we state and prove our main continuous-time result about convergence
in distribution for the number of symbol comparisons. We extend the result by
de-Poissonization to discrete time in Section 4.



2. Background and preliminaries 

2.1. Probabilistic source models for the keys. In this subsection, extracted
with only small modifications from [10], we describe what is meant by a probabilistic
source (our model for how the iid keys are generated) using the terminology and
notation of Vallée et al.

Let $\Sigma$ denote a finite totally ordered alphabet (set of symbols), therefore isomorphic
to $\{0, \dots, r - 1\}$, with the natural order, for some finite r; a word is then an
element of $\Sigma^\infty$, i.e., an infinite sequence (or "string") of symbols. We will follow the
customary practice of denoting a word $w = (w_1, w_2, \dots)$ more simply by $w_1 w_2 \cdots$.
We will use the word "prefix" in two closely related ways. First, the symbol
strings belonging to $\Sigma^k$ are called prefixes of length k, and so $\Sigma^* := \bigcup_{0 \le k < \infty} \Sigma^k$
denotes the set of all prefixes of any nonnegative finite length. Second, if
$w = w_1 w_2 \cdots$ is a word, then we will call

(2.1)  $w(k) := w_1 w_2 \cdots w_k \in \Sigma^k$

its prefix of length k.

Lexicographic order is the linear order (to be denoted in the strict sense by $\prec$)
on the set of words specified by declaring that $w \prec w'$ if (and only if) for some
$0 \le k < \infty$ the prefixes of $w$ and $w'$ of length k are equal but $w_{k+1} < w'_{k+1}$. Then
the symbol-comparisons cost of determining $w \prec w'$ for such words is just $k + 1$,
the number of symbol comparisons.

A probabilistic source is simply a stochastic process $W = W_1 W_2 \cdots$ with state
space $\Sigma$ (endowed with its total $\sigma$-field) or, equivalently, a random variable $W$
taking values in $\Sigma^\infty$ (with the product $\sigma$-field). According to Kolmogorov's
consistency criterion (e.g., [2, Theorem 3.3.6]), the distributions $\mu$ of such processes
are in one-to-one correspondence with consistent specifications of finite-dimensional
marginals, that is, of the probabilities

$p_w := \mu(\{w_1 \cdots w_k\} \times \Sigma^\infty), \qquad w = w_1 w_2 \cdots w_k \in \Sigma^*.$

Here the fundamental probability $p_w$ is the probability that a word drawn from $\mu$
has $w_1 \cdots w_k$ as its length-k prefix.

Because the analysis of Quicksort is significantly more complicated when its
input keys are not all distinct, we will restrict attention to probabilistic sources
with continuous distributions $\mu$. Expressed equivalently in terms of fundamental
probabilities, our continuity assumption is that for any $w = w_1 w_2 \cdots \in \Sigma^\infty$ we
have $p_{w(k)} \to 0$ as $k \to \infty$, recalling the prefix notation (2.1).

Example 2.1. We present a few classical examples of sources. For more examples,
and for further discussion, see Section 3 of [23].

(a) In computer science jargon, a memoryless source is one with $W_1, W_2, \dots$ iid.
Then the fundamental probabilities have the product form

$p_w = p_{w_1} p_{w_2} \cdots p_{w_k}, \qquad w = w_1 w_2 \cdots w_k \in \Sigma^*.$

(b) A Markov source is one for which $W_1 W_2 \cdots$ is a Markov chain.

(c) An intermittent source (a model for long-range dependence) over the finite
alphabet $\Sigma = \{0, \dots, r - 1\}$ is defined by specifying the conditional distributions
$\mathcal{L}(W_j \mid W_1, \dots, W_{j-1})$ ($j \ge 2$) in a way that pays special attention to a particular
symbol $\sigma$. The source is said to be intermittent of exponent $\gamma > 0$ with respect to $\sigma$
if $\mathcal{L}(W_j \mid W_1, \dots, W_{j-1})$ depends only on the maximum value k such that the last k
symbols in the prefix $W_1 \cdots W_{j-1}$ are all $\sigma$ and (i) is the uniform distribution on $\Sigma$,
if $k = 0$; and (ii) if $1 \le k \le j - 1$, assigns mass $[k/(k+1)]^\gamma$ to $\sigma$ and distributes
the remaining mass uniformly over the remaining elements of $\Sigma$.

For our results, the quantity

(2.2)  $\pi_k := \max\{p_w : w \in \Sigma^k\}$

will play an important role, as it did in [23, eqn. (7)] in connection with the
generalized Dirichlet series $\Pi(s) := \sum_{k \ge 0} \pi_k^{-s}$. In particular, it will be sufficient to
obtain $L^p$-convergence in our main result (Theorem 3.1) that

(2.3)  $\Pi(-1/p) = \sum_{k \ge 0} \pi_k^{1/p} < \infty$;

a sufficient condition for this, in turn, is of course that the source is $\Pi$-tamed with
$\gamma > p$ in the sense of the following definition:

Definition 2.2. Let $0 < \gamma < \infty$ and $0 < A < \infty$. We say that the source is $\Pi$-tamed
(with parameters $\gamma$ and $A$) if the sequence $(\pi_k)$ at (2.2) satisfies

$\pi_k \le A (k + 1)^{-\gamma}$ for every $k \ge 0$.

Observe that a $\Pi$-tamed source is always continuous.

Remark 2.3. (a) Many common sources have geometric decrease in $\pi_k$ (call these
"g-tamed") and so for any $\gamma$ are $\Pi$-tamed with parameters $\gamma$ and $A$ for suitably
chosen $A = A_\gamma$.

For example, a memoryless source satisfies $\pi_k = p_{\max}^k$, where

$p_{\max} := \sup_{w \in \Sigma^1} p_w$

satisfies $p_{\max} < 1$ except in the highly degenerate case of an essentially single-symbol
alphabet. We also have $\pi_k \le p_{\max}^{k-1}$ for any Markov source, where now $p_{\max}$
is the supremum of all one-step transition probabilities, and so such a source is
g-tamed provided $p_{\max} < 1$. Expanding dynamical sources (cf. [3]) are also g-tamed.

(b) For an intermittent source as in Example 2.1(c), for all large k the maximum
probability $\pi_k$ is attained by the prefix $\sigma^k$ and equals

$r^{-1} \prod_{i=1}^{k-1} [i/(i+1)]^\gamma = r^{-1} k^{-\gamma}.$

Intermittent sources are therefore examples of $\Pi$-tamed sources for which $\pi_k$ decays
at a truly inverse-polynomial rate, not an exponential rate as in the case of g-tamed
sources.
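
The two decay regimes in Remark 2.3 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the memoryless maximum is found by brute-force enumeration of all length-k prefixes, and for the intermittent source the first symbol is assumed to be uniform on the alphabet, so that the probability of a run of k copies of the special symbol telescopes to $r^{-1} k^{-\gamma}$.

```python
from itertools import product
from math import prod, isclose

def pi_k_memoryless(symbol_probs, k):
    """Brute-force pi_k = max{p_w : w in Sigma^k} for a memoryless source,
    where p_w is the product of the symbol probabilities along w."""
    return max(prod(w) for w in product(symbol_probs, repeat=k))

def sigma_run_probability(r, gamma, k):
    """Probability of the prefix sigma^k (k >= 1) for an intermittent source of
    exponent gamma over an r-symbol alphabet (first symbol assumed uniform)."""
    p = 1.0 / r
    for i in range(1, k):
        # After i consecutive sigmas, the next symbol is sigma with mass (i/(i+1))^gamma.
        p *= (i / (i + 1)) ** gamma
    return p
```

For instance, `pi_k_memoryless((0.5, 0.3, 0.2), 4)` agrees with `0.5 ** 4`, a geometric decay, whereas `sigma_run_probability(3, 1.5, k)` decays only like `k ** -1.5`.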

2.2. Known results for the numbers of key comparisons for Quicksort. In
this subsection we review four known Quicksort key-comparisons results (the first
two formulated in discrete time and the next two in continuous time) that will be
useful in proving our main Theorem 3.1. The first gives exact and asymptotic
formulas for the expected number of key comparisons in discrete time and is extremely
basic and well known. (See, e.g., (2.1)-(2.2) in [5].)

Lemma 2.4. Let $K_n$ denote the number of key comparisons required to sort a list
of n distinct keys. Then, with $H_n := \sum_{j=1}^n j^{-1}$ the nth harmonic number and $\gamma$
here denoting Euler's constant,

$\mathbf{E} K_n = 2(n + 1) H_n - 4n$

(2.4)  $\phantom{\mathbf{E} K_n} = 2n \ln n - (4 - 2\gamma) n + 2 \ln n + (2\gamma + 1) + O(1/n).$
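
Lemma 2.4 is easy to sanity-check numerically. The Python sketch below (helper names are hypothetical) computes the mean exactly from the standard pivot recurrence $\mathbf{E} K_n = n - 1 + \frac{2}{n} \sum_{j<n} \mathbf{E} K_j$, which follows from conditioning on the rank of a uniformly chosen pivot, and compares it with the closed form and with the asymptotic expansion.

```python
from fractions import Fraction
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def mean_key_comparisons_exact(n_max):
    """Exact E K_n for n = 0..n_max via E K_n = n - 1 + (2/n) * sum_{j<n} E K_j."""
    means = [Fraction(0)]
    running_sum = Fraction(0)          # sum of E K_j over j < n
    for n in range(1, n_max + 1):
        means.append(Fraction(n - 1) + Fraction(2, n) * running_sum)
        running_sum += means[-1]
    return means

def mean_key_comparisons_closed(n):
    """Closed form 2(n + 1) H_n - 4n, as an exact fraction."""
    h_n = sum(Fraction(1, j) for j in range(1, n + 1))
    return 2 * (n + 1) * h_n - 4 * n

def mean_key_comparisons_asymptotic(n):
    """Expansion 2n ln n - (4 - 2 gamma) n + 2 ln n + (2 gamma + 1), without the O(1/n) term."""
    return (2 * n * math.log(n) - (4 - 2 * EULER_GAMMA) * n
            + 2 * math.log(n) + (2 * EULER_GAMMA + 1))
```

The recurrence and the closed form agree exactly, and at n = 1000 the expansion differs from the exact mean by less than 0.002, in line with the O(1/n) remainder.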

The second result, mentioned previously at (1.2), is due to Régnier [19], who
also proved convergence in $L^p$ for every finite p. Recall the natural coupling
discussed in Section 1.1.

Lemma 2.5 ([19]). Under the natural coupling, there exists a random variable T
satisfying

(2.5)  $\frac{K_n - \mathbf{E} K_n}{n + 1} \to T$ almost surely.

We now shift to continuous time by assuming that the successive keys are generated
at the arrival times of a Poisson process $N$ with unit rate. The number of
key comparisons through epoch t is then $K_{N(t)}$, which we will abbreviate as $K(t)$;
while the sequence $(K_n)$ is thereby naturally embedded in the continuous-time
process, the random variables $K(n)$ and $K_n$ are not to be confused. We will use such
abbreviations throughout this paper; for example, we will also write $S_{N(t)}$ as $S(t)$.

The third result we review is the continuous-time analogue of Lemma 2.4. Note
the difference in constant terms and the much smaller error term in continuous
time.

Lemma 2.6 ([7, Lemma 5.1]). In the continuous-time setting, the expected number
of key comparisons is given by

$\mathbf{E} K(t) = 2 \int_0^t (t - y)(e^{-y} - 1 + y)\, y^{-2}\, dy.$

Asymptotically, as $t \to \infty$ we have

(2.6)  $\mathbf{E} K(t) = 2t \ln t - (4 - 2\gamma) t + 2 \ln t + (2\gamma + 2) + O(e^{-t} t^{-2}).$
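
The very small error term in (2.6) can be observed numerically. Since $K(t) = K_{N(t)}$ with $N(t)$ Poisson of mean t, the mean $\mathbf{E} K(t)$ is a Poisson mixture of the discrete-time means from Lemma 2.4. The Python sketch below (function names are hypothetical, and the truncation point `n_max` is an arbitrary choice) evaluates that mixture and the expansion in (2.6) without its remainder.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def poissonized_mean(t, n_max=2000):
    """E K(t) = sum_n P(N(t) = n) * E K_n, with E K_n = 2(n + 1) H_n - 4n,
    truncated at n_max (ample for moderate t)."""
    pmf = math.exp(-t)     # P(N(t) = 0); the n = 0 term contributes nothing
    harmonic = 0.0
    total = 0.0
    for n in range(1, n_max + 1):
        pmf *= t / n                   # P(N(t) = n)
        harmonic += 1.0 / n            # H_n
        total += pmf * (2.0 * (n + 1) * harmonic - 4.0 * n)
    return total

def poissonized_asymptotic(t):
    """Expansion 2t ln t - (4 - 2 gamma) t + 2 ln t + (2 gamma + 2), without the remainder."""
    return (2 * t * math.log(t) - (4 - 2 * EULER_GAMMA) * t
            + 2 * math.log(t) + (2 * EULER_GAMMA + 2))
```

Already at t = 30 the two functions agree to within floating-point accuracy, whereas the discrete-time expansion (2.4) carries an O(1/n) error.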

The fourth result gives bounds on the moments of $K(t)$. For real $p \in [1, \infty)$, we
let $\|W\|_p := (\mathbf{E} |W|^p)^{1/p}$ denote $L^p$-norm.

Lemma 2.7 ([7, Lemma 5.3]). For every real $p \in [1, \infty)$, there exists a constant
$c_p < \infty$ such that

$\|K(t) - \mathbf{E} K(t)\|_p \le c_p\, t$ for $t \ge 1$,

$\|K(t)\|_p \le c_p\, t^{2/p}$ for $t \le 1$.

In the special case $p = 2$, it follows immediately from Lemma 2.7 that

(2.7)  $\mathrm{Var}\, K(t) \le c_2^2\, t^2$ for $0 \le t < \infty$.

2.3. Basic probability tools. The following elementary lemma is the basic tool
we will use for $L^p$-convergence. For completeness and the reader's convenience, we
supply a proof.

Lemma 2.8. Let $Y_k(t)$ be random variables, all defined on a common probability
space, for $k = 0, 1, 2, \dots$ and $0 \le t < \infty$. Fix $t_0 \in [0, \infty)$ and $1 \le p < p' < \infty$ and
suppose for some sequences $(b_k)$ and $(b'_k)$ that

(i) for each k we have $Y_k(t) \to Y_k(\infty)$ almost surely as $t \to \infty$,

(ii) for each k we have $\|Y_k(t)\|_p \le b_k$ for all $t_0 \le t < \infty$,

(ii') for each k we have $\|Y_k(t)\|_{p'} \le b'_k < \infty$ for all $t_0 \le t < \infty$, and

(iii) $\sum_{k=0}^\infty b_k < \infty$.

Then

(a) for each $t_0 \le t \le \infty$ the series $\sum_{k=0}^\infty Y_k(t)$ converges in $L^p$ to some random
variable $Y(t)$, and moreover

(b) $Y(t) \to Y(\infty)$ in $L^p$ as $t \to \infty$.

Proof. We assume without loss of generality that $t_0 = 0$. Note that hypotheses (ii)
and (ii') extend to $t = \infty$ by Fatou's lemma.

(a) From (ii) and (iii) it follows for each $0 \le t \le \infty$ that the sequence of partial
sums $\sum_{k=0}^K Y_k(t)$, $K = 0, 1, \dots$, is a Cauchy sequence in the Banach space $L^p$ and
so converges to some random variable $Y(t)$.

(b) We first claim for each k that $Y_k(t) \to Y_k(\infty)$ in $L^p$, i.e., $|Y_k(t) - Y_k(\infty)|^p \to 0$
in $L^1$ as $t \to \infty$. To see this, from (ii') it follows using [2, Exercise 4.5.8] that $|Y_k(t)|^p$
is uniformly integrable in t, as therefore is $|Y_k(t) - Y_k(\infty)|^p$. Our claim then follows
from (i), since almost-sure convergence to 0 implies convergence in probability to 0,
and that together with uniform integrability implies convergence in $L^1$ (e.g., [2,
Theorem 4.5.4]).

Using the triangle inequality for $L^p$-norm, the claim proved in the preceding
paragraph, and the extended condition (ii), we find for any K that

$\limsup_{t \to \infty} \|Y(t) - Y(\infty)\|_p \le \limsup_{t \to \infty} \sum_{k=K+1}^\infty \|Y_k(t) - Y_k(\infty)\|_p \le 2 \sum_{k=K+1}^\infty b_k.$

Now let $K \to \infty$, using (iii), to complete the proof. □

Later (Lemma 3.3) we will transfer Lemma 2.5 to continuous time. When we do
so, the following result will prove useful. This law of the iterated logarithm (LIL)
is well known, and for example can be found for general renewal processes in [14,
Theorem 12.13].

Lemma 2.9 (LIL for a Poisson process). For a Poisson process N with unit rate,

(2.8)  $\mathbf{P}\left( \limsup_{t \to \infty} \frac{N(t) - t}{\sqrt{2t \ln \ln t}} = 1,\ \liminf_{t \to \infty} \frac{N(t) - t}{\sqrt{2t \ln \ln t}} = -1 \right) = 1.$

3. Main results (in continuous time): 
Convergence in Lp (and therefore in distribution) 

The following theorem, which adopts the natural coupling discussed in Section 1.1
and utilizes the terminology and notation of Section 2.1 for probabilistic
sources, is our main result (for continuous time).

Theorem 3.1. Consider the continuous-time setting in which independent and
identically distributed keys are generated from a probabilistic source at the arrival
times of an independent Poisson process N with unit rate. Let $S(t) = S_{N(t)}$ denote
the number of symbol comparisons required by Quicksort to sort the keys generated
through epoch t, and let

(3.1)  $Y(t) := \frac{S(t) - \mathbf{E} S(t)}{t}, \qquad 0 < t < \infty.$

Let $p \in [2, \infty)$ and assume that

(3.2)  $\sum_{k=0}^\infty \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{1/p} < \infty.$

Then there exists a random variable Y such that $Y(t) \to Y$ in $L^p$. Thus $Y(t) \to Y$ in law,
with convergence of moments of orders $\le p$; in particular, $\mathbf{E} Y = 0$.

Remark 3.2. (a) Observe that $\sum_{w \in \Sigma^k} p_w = 1$ for each k. Thus $\sum_{w \in \Sigma^k} p_w^2 \le 1$,
and condition (3.2) grows increasingly stronger as p increases.

(b) Under the weakest instance $p = 2$ of the assumption (3.2) we have $Y(t) \to Y$
in $L^2$, and so $Y(t) \to Y$ in law with convergence of means and variances. The
random variable Y in Theorem 3.1 of course does not (more precisely, can be taken
not to) depend on the value of p considered (because a limit in $L^p$ for any p is also
a limit in probability, and limits in probability are almost surely unique).

(c) The expected number of symbol comparisons in comparing two independent
keys generated by the given source is $\sum_{w \in \Sigma^*} p_w^2 = \sum_{k=0}^\infty \sum_{w \in \Sigma^k} p_w^2$. So (3.2) is
certainly sufficient to imply that $\mathbf{E} S(t) < \infty$ for every t [in fact, it follows from
calculations to be performed in the proof of Theorem 3.1 for $p = 2$ that $\mathbf{E} S^2(t) < \infty$
for every t] and that with probability one $S(t) < \infty$ for all t.

(d) The sum on w in (3.2) is bounded above by the max-prefix probability $\pi_k$
defined at (2.2), and so (2.3) (namely, $\sum_k \pi_k^{1/p} < \infty$) is sufficient for (3.2). Thus
from the discussion in Section 2.1 we see that Theorem 3.1 gives $L^p$-convergence for
$Y(t)$ for all $\Pi$-tamed sources with parameter $\gamma > p$. In particular, for any g-tamed
source, such as any (nondegenerate) memoryless source, we have $Y(t) \to Y$ in $L^p$
for every $p < \infty$.

(e) The standard binary source is a classical example of a periodic memoryless
source (cf. [23]: specifically, Definition 3(d), Theorem 1(ii), and the discussion (ii)
in Section 3). Fill and Janson [5, Proposition 5.4] show explicitly for the standard
binary source that

$\mathbf{E} S(t) = \frac{1}{\ln 2}\, t \ln^2 t - c_1 t \ln t + c_2 t + \pi_t\, t + O(\log t)$ as $t \to \infty$,

where $c_1$, $c_2$ are explicitly given constants and $\pi_t$ is a certain periodic function of
$\log t$. Given the periodic term of order t in the mean for this periodic source, we
find it surprising that Theorem 3.1 nevertheless applies.

(f) We wonder (but have not yet considered): Under what conditions do we have
almost sure convergence in Theorem 3.1 (or in the discrete-time Theorem 4.1)?
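
For a memoryless source the quantities in Remark 3.2 are explicit, because the product form of the fundamental probabilities gives $\sum_{w \in \Sigma^k} p_w^2 = (\sum_a p_a^2)^k$, so the series in (3.2) is geometric. The Python sketch below illustrates this under that memoryless assumption; the function names are hypothetical, and the brute-force sum is there only to confirm the product identity on small cases.

```python
from itertools import product
from math import prod, isclose

def prefix_collision_sum(symbol_probs, k):
    """Brute-force sum of p_w^2 over all prefixes w of length k (memoryless source)."""
    return sum(prod(w) ** 2 for w in product(symbol_probs, repeat=k))

def expected_symbol_comparisons_two_keys(symbol_probs):
    """Sum over k >= 0 of (sum_a p_a^2)^k: expected number of symbol comparisons
    needed to compare two independent keys from a memoryless source."""
    rho = sum(q * q for q in symbol_probs)
    return 1.0 / (1.0 - rho)

def condition_3_2_series(symbol_probs, p):
    """Closed form of sum_k (sum_{w in Sigma^k} p_w^2)^(1/p) for a memoryless source."""
    rho = sum(q * q for q in symbol_probs)
    return 1.0 / (1.0 - rho ** (1.0 / p))
```

For fair bits, `expected_symbol_comparisons_two_keys((0.5, 0.5))` equals 2, and `condition_3_2_series` is finite for every finite p, consistent with part (d) of the remark for nondegenerate memoryless sources.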

To prepare for the proof of Theorem 3.1 we "Poissonize" Lemma 2.5.

Lemma 3.3. In the continuous-time setting of Theorem 3.1, let $K(t) = K_{N(t)}$
denote the number of key comparisons required by Quicksort. Then for the same
random variable T as in the discrete-time Lemma 2.5 we have

$\frac{K(t) - \mathbf{E} K(t)}{t} \to T$ almost surely as $t \to \infty$.

Proof. This is routine. According to Lemmas 2.5 and 2.4,

$\frac{K_n - [2n \ln n - (4 - 2\gamma) n]}{n + 1} \to T$ almost surely as $n \to \infty$.

Since $N(t) \to \infty$ almost surely as $t \to \infty$, it follows that

$\frac{K(t) - [2 N(t) \ln N(t) - (4 - 2\gamma) N(t)]}{N(t) + 1} \to T$ almost surely as $t \to \infty$.

Using the strong law of large numbers (SLLN) for N [namely, $N(t)/t \to 1$ almost
surely, for which Lemma 2.9 is plenty sufficient], we deduce

$\frac{K(t) - [2 N(t) \ln N(t) - (4 - 2\gamma) t]}{t} \to T$ almost surely as $t \to \infty$.

From the mean value theorem it follows that $|y \ln y - x \ln x| \le |y - x| (1 + \ln x + \ln y)$
for $x, y \ge 1$. Applying this with $x = t$ and $y = N(t)$ and invoking the SLLN and
the LIL (Lemma 2.9), we find almost surely that for large t we have

$|N(t) \ln N(t) - t \ln t| \le |N(t) - t|\, [1 + \ln N(t) + \ln t] \le \sqrt{3t \ln \ln t}\; [2 \ln t + 1 + o(1)]$

$= O(\sqrt{t \ln \ln t} \times \ln t) = o(t),$

and so

$\frac{K(t) - [2t \ln t - (4 - 2\gamma) t]}{t} \to T$ almost surely as $t \to \infty$.

The desired result now follows from (2.6) in Lemma 2.6. □



We are now ready for the

Proof of Theorem 3.1. We use an idea of Fill and Janson [7, Sec. 5] and decompose
$S(t)$ as $\sum_{k=0}^\infty S_k(t)$, and each $S_k(t)$ further as $\sum_{w \in \Sigma^k} S_w(t)$, where for an integer k
and a prefix $w \in \Sigma^k$ we define (with little possibility of notational confusion)

$S_k(t)$ := number of comparisons of (k + 1)st symbols,

$S_w(t)$ := number of comparisons of (k + 1)st symbols between keys with prefix w.

A major advantage of working in continuous time is that,

(3.3) for each fixed k and t, the variables $S_w(t)$ with $w \in \Sigma^k$ are independent.

A further key observation, clear after a moment's thought, is this: For each $w \in \Sigma^*$,
as stochastic processes,

(3.4) $(S_w(t) : t \in [0, \infty))$ is a probabilistic replica of $(K(p_w t) : t \in [0, \infty))$.

We define corresponding normalized variables as follows:

$Y_k(t) := \frac{S_k(t) - \mathbf{E} S_k(t)}{t}, \qquad Y_w(t) := \frac{S_w(t) - \mathbf{E} S_w(t)}{t},$

with the normalized variable $Y(t)$ corresponding to $S(t)$ defined at (3.1). Then

$Y(t) = \sum_{k=0}^\infty Y_k(t), \qquad Y_k(t) = \sum_{w \in \Sigma^k} Y_w(t) \quad (k = 0, 1, \dots).$

To complete the proof of $L^p$-convergence of $Y(t)$ we then need only find random
variables $Y_k(\infty)$ such that the hypotheses of Lemma 2.8 are satisfied for some
$p' \in (p, \infty)$. [Once we have the main conclusion of the theorem that $Y(t)$ converges
to Y in $L^p$, convergence in law with convergence of moments of orders $\le p$ follows
immediately; in particular, since $\mathbf{E} Y(t) = 0$ and $\mathbf{E} Y(t) \to \mathbf{E} Y$ we have $\mathbf{E} Y = 0$.]
But, for each $w \in \Sigma^*$, the existence of an almost-sure limit, call it $Y_w(\infty)$, for
$Y_w(t)$ as $t \to \infty$ follows from (3.4) and Lemma 3.3; indeed, we see that $Y_w(\infty)$ has
the same distribution as $p_w T$, with T as in Lemma 3.3. Taking the finite sum over
$w \in \Sigma^k$, we see that $Y_k(\infty)$ can be defined as $\sum_{w \in \Sigma^k} Y_w(\infty)$ to meet hypothesis (i)
of Lemma 2.8.

To verify the remaining hypotheses we choose $t_0 = 1$ and need to bound the
$L^q$-norm of $Y_k(t)$ for k a nonnegative integer, $t \in [1, \infty)$, and $q \in \{p, p'\}$. According
to Lemma 3.4 to follow, for any real $q \in [2, \infty)$ there exists a constant $c'_q$ such that

$\|Y_k(t)\|_q \le c'_q \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{1/q}$

for such k and t. Thus hypotheses (ii) and [for any $p' \in (p, \infty)$] (ii') of Lemma 2.8
hold, and the assumption (3.2) implies that (iii) does, as well. □

Lemma 3.4. Adopt the notation in the above proof of Theorem 3.1. Then for every
real $q \in [2, \infty)$, there exists a constant $c'_q < \infty$ such that

$\|Y_k(t)\|_q \le c'_q \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{1/q}$

for every nonnegative integer k and every $t \in [1, \infty)$.

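Proof. Fix $q \in [2, \infty)$. The first step is to use (as did Fill and Janson [7, proof of
Proposition 5.7]) Rosenthal's inequality, relying on the fact [recall (3.3)] that $S_k(t)$ is
the independent sum of $S_w(t)$ with $w \in \Sigma^k$. According to Rosenthal's inequality [20,
Theorem 3] (see also, e.g., [11, Theorem 3.9.1]) there exists a constant $b_q$ (depending
only on q) such that

$t^q \|Y_k(t)\|_q^q = \|S_k(t) - \mathbf{E} S_k(t)\|_q^q$

$\le b_q \max\Bigl\{ \sum_{w \in \Sigma^k} \|S_w(t) - \mathbf{E} S_w(t)\|_q^q,\ \Bigl( \sum_{w \in \Sigma^k} \|S_w(t) - \mathbf{E} S_w(t)\|_2^2 \Bigr)^{q/2} \Bigr\}.$

Utilizing (3.4) and Lemma 2.7 together with the assumptions $t \ge 1$ and $q \ge 2$ we
therefore find

$\|Y_k(t)\|_q^q \le b_q \max\Bigl\{ c_q^q \sum_{w \in A_k(t)} p_w^q + (2 c_q)^q\, t^{-(q-2)} \sum_{w \in B_k(t)} p_w^2,\ \Bigl( c_2^2 \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{q/2} \Bigr\}$

$\le b_q \max\Bigl\{ (2 c_q)^q \sum_{w \in \Sigma^k} p_w^2,\ c_2^q \sum_{w \in \Sigma^k} p_w^2 \Bigr\}$

$\le (c'_q)^q \sum_{w \in \Sigma^k} p_w^2,$

where $A_k(t)$ and $B_k(t)$ are the intersections of $\Sigma^k$ with $\{w : p_w t > 1\}$ and
$\{w : p_w t \le 1\}$, respectively, and

$c'_q := b_q^{1/q} \max\{2 c_q, c_2\}.$

Here the second inequality uses $p_w^q \le p_w^2$, $t^{-(q-2)} \le 1$, and $\sum_{w \in \Sigma^k} p_w^2 \le 1$.
This completes the proof of Lemma 3.4. □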



4. Discrete time 

In this final section we de-Poissonize Theorem 3.1 in order to obtain an analogous
result in discrete time, for which we need to strengthen the hypothesis slightly.

Theorem 4.1. Let $S_n$ denote the number of symbol comparisons required by Quicksort
to sort the first n keys generated. Let $p \in [2, \infty)$ and assume that for some $p' > p$
we have

(4.1)  $\sum_{k=0}^\infty \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{1/p'} < \infty.$

If Y is the continuous-time limit from Theorem 3.1, then

(4.2)  $\frac{S_n - \mathbf{E} S_n}{n} \to Y$ in $L^p$ as $n \to \infty$.

In particular, we have convergence in distribution, with convergence of moments of
orders $\le p$.

We will derive Theorem 4.1 from Theorem 3.1, and our proof will need the
following moderate deviation estimate for $N(t)$.

Lemma 4.2. For any $0 < \epsilon < 1/6$, we have

$\mathbf{P}(|N(t) - t| \ge t^{(1/2)+\epsilon}) \sim \sqrt{2/\pi}\; t^{-\epsilon} \exp\bigl(-\tfrac{1}{2} t^{2\epsilon}\bigr)$ as $t \to \infty$.

Proof of Lemma 4.2. It is well known that the normal approximation gives correct
lead-order asymptotics for right-tail deviations from the mean starting from a point
that is, as here, $o(t^{2/3})$. Thus if Z is distributed standard normal, then

$\mathbf{P}(|N(t) - t| \ge t^{(1/2)+\epsilon}) \sim \mathbf{P}(|Z| \ge t^{\epsilon}) \sim \sqrt{2/\pi}\; t^{-\epsilon} \exp\bigl(-\tfrac{1}{2} t^{2\epsilon}\bigr),$

as claimed. □

In the following proof, given a sequence of events $(B_n)$, we say that $B_n$ occurs
"wvlp" (for "with very low probability") if $\mathbf{P}(B_n)$ is at most an amount exponentially
small in a power of n; we say that $B_n$ occurs "wvhp" (for "with very high
probability") if the complement $B_n^c$ occurs wvlp.



Proof of Theorem 4.1. To prove (4.2) from the integer-time consequence

$\frac{S(n) - \mathbf{E} S(n)}{n} \to Y$ in $L^p$

of Theorem 3.1, it is of course sufficient to prove

(4.3)  $\frac{S(n) - S_n}{n} \to 0$ in $L^p$

and

(4.4)  $\frac{\mathbf{E} S(n) - \mathbf{E} S_n}{n} \to 0.$

Further, since (4.4) follows immediately from (4.3), it is sufficient to prove (4.3).

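To prove (4.3), the key is to recall that $S(t) = S_{N(t)}$, where $N$ is a unit-rate Poisson
process independent of $(S_0, S_1, \dots)$, and to note that $S_n$ increases with $n$. Let
$0 < \varepsilon < 1/3$. Applying Lemma 4.2 with $(t, \varepsilon)$ there taken to be $(n + n^{(1/2)+\varepsilon}, \varepsilon/2)$,
wvhp we have

(4.5)   \[ N(n + n^{(1/2)+\varepsilon}) \ge (n + n^{(1/2)+\varepsilon}) - (n + n^{(1/2)+\varepsilon})^{\frac12 + \frac12 \varepsilon} \ge n, \]

where the second inequality holds for large enough $n$. Similarly, wvhp we have

(4.6)   \[ N(n - n^{(1/2)+\varepsilon}) \le (n - n^{(1/2)+\varepsilon}) + (n - n^{(1/2)+\varepsilon})^{\frac12 + \frac12 \varepsilon} \le n. \]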
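The two deterministic inequalities in (4.5) and (4.6) can be checked by hand. The short computation below is one way to do so; the explicit threshold $n \ge 2^{2/\varepsilon}$ is an illustrative choice, not something stated in the text. Note also that Lemma 4.2 is applicable with $\varepsilon/2$ in place of $\varepsilon$ precisely because $\varepsilon < 1/3$ gives $\varepsilon/2 < 1/6$.

```latex
% Second inequality in (4.5): for n >= 1 we have n + n^{(1/2)+\varepsilon} <= 2n, hence
\[
  (n + n^{(1/2)+\varepsilon})^{\frac12 + \frac12\varepsilon}
  \le 2^{\frac12 + \frac12\varepsilon}\, n^{\frac12 + \frac12\varepsilon}
  \le 2\, n^{\frac12 + \frac12\varepsilon}
  \le n^{(1/2)+\varepsilon}
  \quad \text{whenever } n^{\varepsilon/2} \ge 2, \text{ i.e. } n \ge 2^{2/\varepsilon}.
\]
% Second inequality in (4.6): for n >= 1,
\[
  (n - n^{(1/2)+\varepsilon})^{\frac12 + \frac12\varepsilon}
  \le n^{\frac12 + \frac12\varepsilon}
  \le n^{(1/2)+\varepsilon}.
\]
```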

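Because $S_n \uparrow$, it follows from (4.5)–(4.6) that

\[ S(n - n^{(1/2)+\varepsilon}) \le S_n \le S(n + n^{(1/2)+\varepsilon}) \quad \text{wvhp}, \]

and hence wvhp

\[
\begin{aligned}
|S(n) - S_n| &\le \max\bigl\{ S(n) - S(n - n^{(1/2)+\varepsilon}),\; S(n + n^{(1/2)+\varepsilon}) - S(n) \bigr\} \\
&\le \bigl[ S(n) - S(n - n^{(1/2)+\varepsilon}) \bigr] + \bigl[ S(n + n^{(1/2)+\varepsilon}) - S(n) \bigr] \\
&= S(n + n^{(1/2)+\varepsilon}) - S(n - n^{(1/2)+\varepsilon}).
\end{aligned}
\]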



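So to complete the proof of Theorem 4.1 by proving (4.3), it is sufficient to show
that

(4.7)   \[ \frac{S(n + n^{(1/2)+\varepsilon}) - S(n - n^{(1/2)+\varepsilon})}{n} \xrightarrow{L^p} 0 \]

and

(4.8)   \[ \frac{S(n) - S_n}{n}\, \mathbf{1}(A_n) \xrightarrow{L^p} 0, \]

where $A_n$ is any event wvlp and $\mathbf{1}(A_n)$ is its indicator. We prove (a) (4.8) and then
(b) (4.7).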

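(a) To bound the $L^p$-norm of the random variable on the left in (4.8), we use
Hölder's inequality $\|Z_1 Z_2\|_1 \le \|Z_1\|_q \|Z_2\|_{q'}$ with

\[ Z_1 = \left| \frac{S(n) - S_n}{n} \right|^p, \quad Z_2 = \mathbf{1}(A_n)^p = \mathbf{1}(A_n), \quad q = \frac{p'}{p} > 1, \quad q' = \frac{p'}{p' - p} > 1; \]

note that $(1/q) + (1/q') = 1$, as required. Thus

\[ \left\| \frac{S(n) - S_n}{n}\, \mathbf{1}(A_n) \right\|_p^p = \mathbf{E}\left[ \left| \frac{S(n) - S_n}{n} \right|^p \mathbf{1}(A_n) \right] \le \left\| \frac{S(n) - S_n}{n} \right\|_{p'}^p \times \mathbf{P}(A_n)^{1 - (p/p')}. \]

Because $A_n$ occurs wvlp, it suffices to show that $\|S(n)\|_{p'}$ and $\|S_n\|_{p'}$ each grow at
most polynomially in $n$.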

The first of these two is very easy to handle. Using the hypothesis (4.1), we
know from Theorem 3.1 that

\[ \frac{S(t) - \mathbf{E} S(t)}{t} \xrightarrow{L^{p'}} Y, \]

and it follows that $\|S(t) - \mathbf{E} S(t)\|_{p'}$ grows at most linearly in $t$ as $t \to \infty$. But from
the first sentence of Remark 3.2 we see that $\mathbf{E} S(t)$ grows at most quadratically in $t$,
so by the triangle inequality $\|S(t)\|_{p'}$ grows at most quadratically in $t$.

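Now we turn our attention to $\|S_n\|_{p'}$. Just as we observed in the preceding
paragraph that $\mathbf{E} S(t)$ grows at most quadratically in $t$, we observe here that

\[ 0 \le S_n \le \sum_{1 \le i < j \le n} C_{ij}, \]

where $C_{ij}$ is the cost of comparing the $i$th and $j$th keys, and hence (with $C := C_{12}$)

\[ \|S_n\|_{p'} \le \sum_{1 \le i < j \le n} \|C_{ij}\|_{p'} = \binom{n}{2} \|C\|_{p'}. \]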


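So, to conclude that $\|S_n\|_{p'}$ grows at most quadratically in $n$, we need only show
that $\|C\|_{p'}$ is finite. Indeed, for any $t \in (0, \infty)$ we have

\[ \infty > \mathbf{E}\, S(t)^{p'} \ge \mathbf{E}\bigl[ S(t)^{p'}\, \mathbf{1}(N(t) \ge 2) \bigr] \ge \mathbf{E}\bigl[ C^{p'}\, \mathbf{1}(N(t) \ge 2) \bigr] = \bigl( \mathbf{E}\, C^{p'} \bigr)\, \mathbf{P}(N(t) \ge 2) \]

and $\mathbf{P}(N(t) \ge 2) > 0$, so $\mathbf{E}\, C^{p'} < \infty$.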

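(b) It remains to establish (4.7). From two applications of Theorem 3.1 it follows
quickly that

\[ \frac{S(n + n^{(1/2)+\varepsilon}) - \mathbf{E} S(n + n^{(1/2)+\varepsilon})}{n} \xrightarrow{L^p} Y \quad \text{and} \quad \frac{S(n - n^{(1/2)+\varepsilon}) - \mathbf{E} S(n - n^{(1/2)+\varepsilon})}{n} \xrightarrow{L^p} Y; \]

thus it suffices to prove

(4.9)   \[ \frac{\mathbf{E} S(n + n^{(1/2)+\varepsilon}) - \mathbf{E} S(n - n^{(1/2)+\varepsilon})}{n} \to 0. \]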

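Recall from the proof of Theorem 3.1 that

\[ \mathbf{E} S(t) = \sum_{k=0}^{\infty} \sum_{w \in \Sigma^k} \mathbf{E} K(p_w t) \]

and from Lemma 2.6 that we know an explicit formula for $\mathbf{E} K(t)$, namely,

\[ \mathbf{E} K(t) = 2 \int_0^t (t - y)(e^{-y} - 1 + y)\, y^{-2}\, dy. \]

This function and its increasing derivative, call it $d(t)$, are both easily studied. In
particular, $d(t) \sim t$ as $t \downarrow 0$, and $d(t) \sim 2 \ln t$ as $t \to \infty$. Hence for any $0 < \delta < 1$
there exists a finite constant $a_\delta$ such that

\[ d(t) \le a_\delta t^{\delta} \quad \text{for all } t \in (0, \infty). \]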
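The two asymptotic claims about $d(t)$ are easy to probe numerically. Differentiating the formula for $\mathbf{E} K(t)$ under the integral sign gives $d(t) = 2 \int_0^t (e^{-y} - 1 + y)\, y^{-2}\, dy$; the sketch below evaluates this integral by Simpson's rule in the variable $u = \ln y$. The quadrature scheme, the cutoff constants, and the function names are assumptions chosen for illustration.

```python
import math


def integrand(y):
    """(e^{-y} - 1 + y) / y^2, using a Taylor series near 0 to avoid cancellation."""
    if y < 1e-3:
        return 0.5 - y / 6.0 + y * y / 24.0
    return (math.expm1(-y) + y) / (y * y)


def d(t, panels=4000):
    """d(t) = 2 * integral_0^t integrand(y) dy, via Simpson's rule in u = log y.

    `panels` must be even. Requires t > 0.
    """
    lo = min(1e-9, t * 1e-3)   # below lo the integrand is essentially 1/2
    head = 0.5 * lo            # integral over [0, lo], up to O(lo^2)
    a, b = math.log(lo), math.log(t)
    h = (b - a) / panels
    total = 0.0
    for i in range(panels + 1):
        y = math.exp(a + i * h)
        weight = 1 if i in (0, panels) else (4 if i % 2 else 2)
        total += weight * integrand(y) * y   # dy = y du
    return 2.0 * (head + total * h / 3.0)
```

For instance, `d(t) / t` can be inspected for small `t` and `d(t) / (2 * math.log(t))` for large `t`; the second ratio approaches 1 only slowly, because an integration by parts shows that $d(t) - 2 \ln t$ tends to a nonzero constant. Evaluating `d(t) / t**delta` over a wide grid of `t` gives a rough empirical feel for the constant $a_\delta$.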

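Then for any $0 < t < u < \infty$ we have

\[
\begin{aligned}
0 \le \mathbf{E} S(u) - \mathbf{E} S(t) &= \sum_{k=0}^{\infty} \sum_{w \in \Sigma^k} \bigl[ \mathbf{E} K(p_w u) - \mathbf{E} K(p_w t) \bigr] \\
&\le (u - t) \sum_{k=0}^{\infty} \sum_{w \in \Sigma^k} p_w\, d(p_w u) \le a_\delta b_\delta\, (u - t)\, u^{\delta}
\end{aligned}
\]

with

\[ b_\delta := \sum_{k=0}^{\infty} \sum_{w \in \Sigma^k} p_w^{1+\delta}. \]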

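Therefore,

\[ \mathbf{E} S(n + n^{(1/2)+\varepsilon}) - \mathbf{E} S(n - n^{(1/2)+\varepsilon}) \le 2 a_\delta b_\delta\, n^{(1/2)+\varepsilon} (n + n^{(1/2)+\varepsilon})^{\delta} = o(n), \]

as desired for (4.9), provided $\tfrac12 + \varepsilon + \delta < 1$ and $b_\delta < \infty$. Our proof thus far has
been valid for any $0 < \varepsilon < 1/3$, but we now restrict to $0 < \varepsilon < 1/4$ and choose
$\delta = \tfrac12 - 2\varepsilon \in (0, \tfrac12)$. The proof of Theorem 4.1 will be complete once we see that $\varepsilon$
and $\delta$ can be chosen so that $b_\delta$ is finite.

Fix $k$ and recall that $\sum_{w \in \Sigma^k} p_w = 1$. Let $W$ be a random variable with probability
mass function $(p_w, w \in \Sigma^k)$, and let $Z := p_W$. Then

\[ \sum_{w \in \Sigma^k} p_w^{1+\delta} = \mathbf{E} Z^{\delta} = \|Z\|_{\delta}^{\delta} \le \|Z\|_1^{\delta} = (\mathbf{E} Z)^{\delta} = (\mathbf{E}\, p_W)^{\delta} = \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{\delta}. \]

We can arrange for $\delta \ge 1/p'$ by choosing $0 < \varepsilon \le \tfrac14 - \tfrac{1}{2p'}$, which is possible because
$p' > p \ge 2$. Then

\[ b_\delta \le \sum_{k=0}^{\infty} \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{\delta} \le \sum_{k=0}^{\infty} \Bigl( \sum_{w \in \Sigma^k} p_w^2 \Bigr)^{1/p'} < \infty \]

by assumption (4.1). □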
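For a memoryless source the sums over words factor, which makes the last step easy to check by brute force. The sketch below, with hypothetical function names and an arbitrarily chosen two-symbol source, enumerates all words of a given length, computes $\sum_{w \in \Sigma^k} p_w^{s}$, and evaluates $b_\delta$ as a geometric series; it is an illustration under the memoryless assumption only, not a statement about general sources.

```python
import math
from itertools import product


def word_probabilities(symbol_probs, k):
    """Probabilities p_w of all words w of length k for a memoryless source."""
    return [math.prod(word) for word in product(symbol_probs, repeat=k)]


def power_sum(symbol_probs, k, s):
    """Sum over all words w of length k of p_w**s, by brute-force enumeration."""
    return sum(p ** s for p in word_probabilities(symbol_probs, k))


def b_delta(symbol_probs, delta):
    """b_delta for a memoryless source, summed in closed form.

    Since the power sum over words of length k equals
    (sum over symbols a of p_a**(1 + delta))**k, b_delta is a geometric
    series. The ratio is below 1 when delta > 0 and at least two symbols
    have positive probability.
    """
    ratio = sum(p ** (1.0 + delta) for p in symbol_probs)
    return 1.0 / (1.0 - ratio)
```

With `probs = (0.3, 0.7)` and `delta = 0.3`, comparing `power_sum(probs, k, 1 + delta)` with `power_sum(probs, k, 2) ** delta` for small `k` illustrates the displayed inequality. The same factorization shows that the sum in (4.1) is geometric for every $p'$, consistent with the abstract's remark that memoryless sources give convergence of moments of every order.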



Acknowledgment. We thank Svante Janson for excellent suggestions that led
to improvements to Theorem 3.1.